Instabilities in complex mixtures with a large number of components 
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Inside living cells are complex mixtures of thousands of components. It is hopeless to try to 
characterise all the individual interactions in these mixtures. Thus, we develop a statistical approach 
to approximating them, and examine the conditions under which the mixtures phase separate. The 
approach approximates the matrix of second virial coefficients of the mixture by a random matrix, 
and determines the stability of the mixture from the spectrum of such random matrices. 
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Mixtures are not always simple, well-characterised and 
made up of 2 or 3 components. The mixtures of bio- 
macromolecules inside living organisms contain thou- 
sands of different macromolecules, and the oil extracted 
from wells by the petroleum industry may also contain 
many different hydrocarbons and related compounds. 
This gives us two main problems: i) the number of com- 
ponents is so large that the data we have is inadequate to 
characterise all the components, ii) even if we were able 
to precisely characterise each component then compre- 
hending, and calculating with, this mass of data would 
be difficult. The sheer complexity of the mixture over- 
whelms our ability to comprehend the mixture and to 
predict its properties. An analogous problem afflicted 
nuclear physics 50 years ago. Large nuclei, such as ^^^U, 
are complex many-body systems with complex spectra. 
Nuclear physicists were faced with energy spectra with 
so many energy levels that comprehending and predict- 
ingthem directly was impossible. Starting with Wigner 
QMIEQi they resorted to statistical methods, and re- 
placed the complex and unknown Hamiltonian matrix 
of a nucleus with a Hamiltonian matrix whose elements 
were random variables. A drastic approximation but one 
which worked. They were able to reproduce the statisti- 
cal properties of energy spectra. By statistical properties, 
we mean properties such as the probability distribution 
function of these spacings. Subsequently, random matri- 
ces have been applied in many areas of physics |^, |^ . 

Here, we will apply an analogous statistical theory to 
complex mixtures. We start by noting that, at the sim- 
plest level, the interactions between two components i 
and j affect the free energy, and hence potentially drive 
a phase transition, via their second virial coefficient Bij. 
These second virial coefficients form a symmetric matrix 
of course, and the eigenvalues of this matrix describe the 
change in the excess free energy when the density is per- 
turbed. For an TV 3> 1 component mixture we need a 
huge number, of order 7V^, virial coefficients to specify 
the mixture, but typically we are not interested in know- 
ing each individual value but we are interested in how 
the system reacts to density perturbations, because if 
the free energy change when the density is perturbed be- 



comes negative, the system is unstable and will undergo a 
phase transition. The parallels between our situation and 
that faced by nuclear physicists 50 years ago are obvious 
and so we adopt their solution: we replace the matrix 
of, unknown, second virial coefficients of some mixture, 
by a random matrix. For large iV, the eigenvalues of this 
matrix depend on two variables only: the mean and stan- 
dard deviation of its elements, which we can vary to fit 
a specific mixture or just vary to explore the generic fea- 
tures of the phase behaviour of such a mixture. Once we 
are using random matrices, the fact that we have N ^ 1 
components is a help not a hindrance. 

The Helmholtz free energy per unit volume, /, of an 
N component mixture, truncated after the second-virial 
coefficient terms is 



N ^ N N 



(1) 



i=l j = l 



where pi is the number density of component i. The 
N densities form a row matrix p — [pi p2 - ■ - Pn)- We 
use units such that the thermal energy ksT — 1. Here, 
we will not calculate complete phase diagrams with the 
densities and compositions of coexisting phases. We will 
calculate the limits of stability, spinodals, where the sys- 
tem becomes unstable with respect to density perturba- 
tions. Thus, we will only be able to determine qualitative 
features of the phase behaviour, such as whether or not 
a phase transition occurs and whether the transition is 
demixing of the components or their condensation. Sta- 
bility of the mixture requires that / be convex. Con- 
vexity requires that the second order term, in an 
expansion of / in powers of 6p, be positive for any small 
perturbation Sp. For our N component mixtures dp is a 
row matrix of length TV. Now, f is given by 



S'f = lsp{P + B)6p'' 



(2) 



where B is the matrix of second virial coefficients and P 
is a diagonal matrix with the ith diagonal element equal 
to l/pf. it contains the ideal or perfect gas contributions 
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to the free energy change. Of course, any 5p can be ex- 
pressed as sum of eigenvectors of P + B and so the require- 
ment (5^/ > imphes that all eigenvalues of P-l-B must be 
positive. The mixture becomes locally unstable when an 
eigenvalue, the lowest obviously, becomes zero. Stability 
is determined only by the lowest eigenvalue. However, if 
each component has more-or-less the same mobility, then 
the decay of small density modulations can be described 
as the decay of the set of eigenvectors of P -I- B, with 
each eigenvector component of the density modulation 
decaying at a rate proportional to its eigenvalue. 

The eigenvalues of P + B form the row matrix 7. 
For simplicity we will assume that in the mixture all 
components are present in equal amounts pi = pt/N, 
i = 1,...,N, where px is the total density. Then 
P = {N/pt)\, where I is the iV by TV unit or identity 
matrix, and the eigenvectors of P -I- B are equal to those 
of B. The eigenvalues of P + B, are those of B, which 
form a row matrix A, shifted by N/px, i.e.. 
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7 = A + {N/pt)i 



(3) 



FIG. 1: The probability density function for the eigenvalues 
of B, p{X). The 3 curves are, from bottom to top, for b/a = 
— 1, and 1. The curve for hja = —1 is shifted down by 0.05, 
and that for fe/a = 1 is shifted up by 0.05, as otherwise their 
central semicircular parts are almost on top of each other. 



where u = (1, . . . , 1). 

Stability is determined by the sign of the lowest eigen- 
value of P -I- B, 7inin- It requires 7min > 0, and the 
spinodal is reached when 7min = 0. From Eq. Q 
it follows that the total density pT at the spinodal is 
Psp = — -/V/Aniin, with Xmin thc lowcst eigenvalue of B. 

Now, B is a random matrix, and for simplicity we 
choose its elements Bij, i < j, as independent random 
variables with mean b and standard deviation a (we ac- 
tually choose a Gaussian distribution, but the particular 
choice of this distribution is irrelevant for large N). Since 
Wigner's pioneering work the problem of character- 
ising the spectrum of such a random matrix has received 
a great deal of attention (see 6] for a review) . In our par- 
ticular case, there are two theorems that fully describe 
the spectrum of B. 

The first theorem is due to Arnold 0, 1^ , and states 
that the density of rescaled eigenvalues, x = \/2aN^/^, 
of B converges in probability, as iV ^ 00, to 



W{x) = 



if Ixl <1, 
if \x\ > 1. 



(4) 



This is known in the literature as Wigner's semicircle law. 
It also holds if the diagonal elements, Ba, have a mean 
b' ^ b. 

The second theorem, by Fiiredi and Komlos 3 1 refines 
this result a little. Under the further assumption that 
\Bij \ < K for alii, j ^ 1,. ..,N,ifb > {b < 0) then the 
highest (lowest) eigenvalue is asymptotically distributed 
with a Gaussian of mean Nb + (&' -b) + a^/b + 0{N-^l'^) 
and variance 2(t^, and the remaining — 1 eigenvalues 
follow Wigner's semicircle law. If & = the highest (low- 
est) eigenvalue is also within the semicircle. 



All this can be seen in numerically calculated spectra, 
for N = 25, in Fig. d For |&| > a/N^^'^ the probabihty 
density function, p(A), for the eigenvalues clearly exhibits 
a lone, Gaussian distributed, eigenvalue, and for all b 
there is a clear semicircle. 

The two theorems above permit us to describe, for 
large N, the lowest eigenvalue of B and hence the spin- 
odal instability of our mixture. Let us define rescaled 
variables /? = N^/^b/a and ^sp = Pspd/N^^^- For 
/3 < — 1, the (rescaled) lowest eigenvalue, Xmin, which 
determines the spinodal is due to a lone eigenvalue, see 
the bottom curve in Fig. ^ This eigenvalue has a mean 
value {P+(3~^)/2 and standard deviation l/^/2N. As the 
latter goes to zero when A^ self-averaging 
quantity. So for large A^ we can take |lO| 



(fsp) = 



-1 



for the spinodal. Replacing Xmin by its mean value, 



(?sp) 



-1 



/3 + r 



(5) 



(6) 



The nature of the instability is described by the corre- 
sponding eigenvector. Fiiredi and Komlos show that this 
eigenvector is almost parallel to u ;9j, so it is a conden- 
sation instability, as the densities of all the components 
increase (or decrease) together according to the eigenvec- 
tor: the instability looks like the incipient formation of 
one phase enriched in all the components coexisting with 
a phase depleted in all the components. 

Conversely, if /3 > —1, then Xmin and hence thc spin- 
odal, is determined by the semicircle. The lowest eigen- 
value will be near the lower end of the semicircle. We 
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FIG. 2: The mean reduced total density at the spinodal, 
(^ap) = {psp)o / N^^'^ , as a function of the reduced mean sec- 
ond virial coefficient, {3 = N^^'^h/a. The solid curves are 
the numerically calculated mean density of the spinodal for 
N = 25, 100 and 400 (from top to bottom). The short-dashed 
horizontal lines are the predictions of Eq. IIH for these val- 
ues of A'^ (again from top to bottom). The N oo limit of 
Eq. @ for /3 < -1, and of Eq. JTTJ ((?sp) = 1/2) for /3 > -1, 
is plotted as a long-dashed curve. For /3 < —1, it lies on top 
of the numerically calculated values and so is not visible. The 
crossover between condensation and demixing at /3 = — 1 is 
marked by a vertical dotted line. 



then estimate the distribution of the lowest eigenvalue as 
follows. 

Given the semicircle law, the expected number of 
(rescaled) eigenvalues between —1 and X will be given 
by N W{x) dx, so Xnun will be in the interval 
[-l,X{N)], where X{N) is defined by 



fXiN) 

N I W{x)dx 



1. 



Using Eq. Q), we obtain an equation for X 
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x + -xVi-x^ = ^, 



which yields 



X{N) = -1 + m + -^m^ + + ©(m"*). 



(7) 



(8) 



(9) 



with m = (l/2)(37r/2A^)2/3. Now, for a given (large) 
s^inin will be roughly roughly roughly roughly roughly 
roughly roughly roughly distributed according to 



PN (Xruin) 



NW{x^in) if-l<Xnun<X{N), 

otherwise. 



(10) 



(Notice that J_^pi\f{x) dx = 1, so it is a well defined 
probability density.) As X{N) — > —1 when N ^ oo, 



FIG. 3: The mean (solid curve) of the absolute value of 
the cosine of the angle 8 between the eigenvector with the 
smallest eigenvalue, and the vector with all its elements equal 
to 1, (I cos(0)|), and its standard deviation (dashed curve), as 
a function of the reduced mean second virial coefficient, j3, 



Xmin is self-averaging in this case as well. We can then 
make use of Eq. |SJl to determine the spinodal. Thus from 
Eqs. ® and lini), 
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0{m^). 



(11) 



As for the eigenvector, we know that when there is a 
lone eigenvalue its corresponding eigenvector is almost 
parallel to u, so the eigenvector of any eigenvalue of the 
semicircle must be almost orthogonal to u (the matrix 
is symmetric). By continuity this holds for aJmin even 
when there is no lone eigenvalue. Then, roughly half of 
the components of the eigenvector are of one sign while 
the rest are of the opposite sign. The instability is with 
respect to a density modulation in which about half the 
components are separating from the other half: the in- 
stability looks like demixing into two phases, each one 
enriched in some components and depleted in others. 

The predictions of theory in both regimes, Eqs. 10 and 
(|ll|l . are compared with the results of numerical calcula- 
tions in Fig. [1 for = 25, 100 and 400. The crossover 
between demixing and condensation occurs for /3 ~ — 1 
in all cases. This crossover can also be seen by looking 
at the angle, 0, the eigenvector of Xmin makes with u. 
In Fig. 131 we have plotted the mean and standard de- 
viation of I cos(6')|, as a function of (3, for A^ = 25. At 
around (3 = —1 the cosine drops and the standard devi- 
ation peaks, indicating that the instability eigenvector is 
switching over from being nearly parallel to u, to being 
nearly perpendicular. 

For /3<— 1, asA^^oo, the instability, which is with 
respect to condensation, approaches that of a single com- 
ponent system with second virial coefficient b. Such a sys- 
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tem becomes unstable at a spinodal density pspb = — 1, 
and Eq. ((BJ yields this result as ^ oo. For /3 > — 1, 
(Csp) — ^1/2, asA^^oo, and so the total number density 
at the spinodal diverges as N^/^. The mixture becomes 
stable with respect to demixing at all densities as TV — > oo 
— also consistent with the mixture behaving as a single 
component system. For finite but large N, demixing re- 
quires either high densities or a broad distribution of the 
second virial coefficients, i.e., large a. In Fig. El we see 
that even for the smallest N of 25, the prediction of our 
theory for demixing (/3 > —1) is quite accurate. We also 
see that for condensation, our theoretical prediction is on 
top of the numerical curves for all 3 values of N, and that 
the density of the condensation instability is insensitive 
to N. 

Many properties of random matrices, such as the dis- 
tribution of their eigenvalues (properly scaled), are self- 
averaging 0- A property f? of an A'' by matrix A at is 
said to be self-averaging if R{An) converges in probabil- 
ity, as A^ ^ oo, to {R{An)), where () denotes an average 
over some ensemble of matrices A^. This simply means 
that, for large A^, almost all matrices Ajv have the same 
value of R. In particular, the lowest (scaled) eigenvalue 
Xmin of our random matrices B is a self-averaging prop- 
erty of these matrices and hence so is the value of the 
density at the spinodal. The distribution function of the 
spacing of the eigenvalues of random matrices is also self- 
averaging, and experimental data on nuclei show that the 
spectra of nuclei far from their ground state are, approx- 
imately, self-averaging 0, 0| . The good agreement be- 
tween theory and experiment in the study of nuclei relies 
on both the model and the experimental system having 
a self-averagingproperty. As emphasised by Dyson and 
others 0, 13^0, |3 > the theoretical prediction is obtained 
via statistical methods but this is then compared to the 
results for a single experimental system, e.g., a ^ssy 
cleus. 

So, are the thermodynamic properties of complex 
mixtures, such as those found inside living cells, self- 
averaging? This can only be determined by experimental 
measurements of these properties, and we are not aware 
of data that could decide one way or the other. We can 
only say that if they are self-averaging, then statistical 
theories like that presented here will be an effective way 
of predicting and understanding their properties, while if 
they are not self-averaging then their properties will be 
sensitive to small details and very difficult to predict. 

The analysis we have just carried out in this Letter 
relies on several simplifying assumptions. We have em- 
ployed a second virial coefficient approximation, have 
studied a mixture of components with equal densities, 
and have taken the virial coefficients to be independent 
random variables. Correlations were neglected so that 
(BijBik) = {Bij)"^ j 7^ k. All three assumptions are sim- 
ple, minimal assumptions which can be relaxed in a more 
elaborate, but still of course statistical, model. 



Quite generally, to develop theories for very complex 
systems, specified by very large numbers of parameters, 
there seems little alternative to statistical approaches. 
By statistical approaches we mean those that have pa- 
rameters which instead of being definite numbers which 
are put into the model, are random variables taken from 
a probability distribution function which is put into the 
model. The cytoplasm of bacteria such as E. coli is 
a very complex system: it is a mixture of thousands 
of different types of rather complex bio-macromolecules, 
mostly protein but also RNA, DNA, polysaccharides, etc. 
[ill Ha . . We can neither obtain nor want details of 
all the interactions of these molecules, but we do want to 
understand and to be able to predict the collective prop- 
erties of this mixture, such as its osmotic pressure, where 
it becomes unstable and so on. In these circumstances, 
statistical approaches, such as the one described here, 
are the only means of making predictions. They rely on 
self-averaging occurring in the experimental system, and 
so, in order to establish their validity, we need to know 
whether or not the complex mixtures found inside liv- 
ing cells have self-averaging thermodynamic properties, 
something that experiments will have to assess. 
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